A Performance Model for Unified Parallel C

Author

  • Zhang Zhang
Abstract

This research is a performance-centric investigation of Unified Parallel C (UPC), a parallel programming language that belongs to the Partitioned Global Address Space (PGAS) language family. The objective is to develop a performance modeling methodology that targets UPC but can be generalized to other PGAS languages. The methodology relies on platform characterization and program characterization, achieved through shared memory benchmarking and static code analysis, respectively. Models built using this methodology can predict the performance of simple UPC application kernels with relative errors below 15%. Besides performance prediction, this work provides a framework based on shared memory benchmarking and code analysis for platform evaluation and for the study of compiler and runtime optimizations. A few platforms are evaluated in terms of their fitness for UPC computing. Some optimization techniques, such as remote reference caching, are studied using this framework. A UPC implementation, MuPC, is developed along with the performance study. MuPC consists of a UPC-to-C translator built upon a modified version of the EDG C/C++ front end and a runtime system built upon MPI and POSIX threads. MuPC performance features include a runtime software cache for remote accesses and low-latency access to shared memory with affinity to the issuing thread. In this research, MuPC serves as a platform that facilitates the development, testing, and validation of performance microbenchmarks and optimization techniques.
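To make the idea of combining the two characterizations concrete, here is a minimal sketch. It is not the model from the thesis, whose exact form the abstract does not give. It assumes, purely for illustration, that platform characterization yields a per-operation cost table and that program characterization yields per-operation counts, and that the two combine additively. The function names, operation names, and all numbers are hypothetical.

```python
# Hypothetical sketch: the additive form, the operation names, and the
# numbers below are illustrative assumptions, not taken from the thesis.

def predict_runtime(op_costs, op_counts):
    """Estimate run time as the sum over operation types of count * cost.

    op_costs:  seconds per operation, as a benchmark might measure them
               (stands in for platform characterization).
    op_counts: number of operations of each type, as code analysis might
               report them (stands in for program characterization).
    """
    missing = set(op_counts) - set(op_costs)
    if missing:
        raise KeyError(f"no measured cost for: {sorted(missing)}")
    return sum(op_counts[op] * op_costs[op] for op in op_counts)


def relative_error(predicted, measured):
    """Relative error of a prediction against a measured run time."""
    return abs(predicted - measured) / measured


# Made-up inputs for a toy kernel.
op_costs = {"local_read": 1e-8, "remote_read": 2e-6, "remote_write": 3e-6}
op_counts = {"local_read": 1_000_000, "remote_read": 10_000, "remote_write": 5_000}

predicted = predict_runtime(op_costs, op_counts)
```

With a measured run time in hand, `relative_error(predicted, measured)` gives the kind of figure the abstract reports as staying below 15% for simple kernels.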

Similar resources

Performance Evaluation of Unified Parallel C for Molecular Dynamics

Partitioned Global Address Space (PGAS) integrates the concepts of shared memory programming and the control of data distribution and locality provided by message passing into a single parallel programming model. The purpose of allying distributed data with shared memory is to cultivate a locality-aware shared memory paradigm. PGAS is comprised of a single shared address space, which is partiti...


High Performance Implementation of Fuzzy C-Means and Watershed Algorithms for MRI Segmentation

Image segmentation is one of the most common steps in digital image processing. There are many image segmentation algorithms (e.g., thresholding, edge detection, and region growing) employed for classifying a digital image into different segments. In this connection, finding a suitable algorithm for medical image segmentation is a challenging task, mainly due to the noise, low contrast, and steep...


Efficiency evaluation of wheat farming: a network data envelopment analysis approach

Traditional data envelopment analysis (DEA) models measure the relative efficiency of decision making units (DMUs) in which multiple inputs are consumed to produce multiple outputs. One drawback of these models is that they neglect the internal processes of each system, which may have intermediate products and/or independent inputs and/or outputs. In this paper some methods which are usab...


Unified Parallel C Profiling Interface Proposal

Due to the wide range of compilers and the lack of a standardized profiling interface, writers of performance tools face many challenges when incorporating support for Unified Parallel C (UPC) programs. This document presents a preliminary specification for a standard profiling interface that attempts to be flexible enough to be adapted into current UPC compiler and runtime infrastructures with...


Publication date: 2007